Reward maximization in general dynamic matching systems

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Matching Behavior as a Tradeoff Between Reward Maximization

When faced with a choice, humans and animals commonly distribute their behavior in proportion to the frequency of payoff of each option. Such behavior is referred to as matching and has been captured by the matching law. However, matching is not a general law of economic choice. Matching in its strict sense seems to be specifically observed in tasks whose properties make matching an optimal or ...

متن کامل

When Does Reward Maximization Lead to Matching Law?

What kind of strategies subjects follow in various behavioral circumstances has been a central issue in decision making. In particular, which behavioral strategy, maximizing or matching, is more fundamental to animal's decision behavior has been a matter of debate. Here, we prove that any algorithm to achieve the stationary condition for maximizing the average reward should lead to matching whe...

متن کامل

Time-based reward maximization.

Humans and animals time intervals from seconds to minutes with high accuracy but limited precision. Consequently, time-based decisions are inevitably subjected to our endogenous timing uncertainty, and thus require temporal risk assessment. In this study, we tested temporal risk assessment ability of humans when participants had to withhold each subsequent response for a minimum duration to ear...

متن کامل

Eye Movements for Reward Maximization

Recent eye tracking studies in natural tasks suggest that there is a tight link between eye movements and goal directed motor actions. However, most existing models of human eye movements provide a bottom up account that relates visual attention to attributes of the visual scene. The purpose of this paper is to introduce a new model of human eye movements that directly ties eye movements to the...

متن کامل

Reward Maximization Under Uncertainty: Leveraging Side-Observations Reward Maximization Under Uncertainty: Leveraging Side-Observations on Networks

We study the stochastic multi-armed bandit (MAB) problem in the presence of sideobservations across actions that occur as a result of an underlying network structure. In our model, a bipartite graph captures the relationship between actions and a common set of unknowns such that choosing an action reveals observations for the unknowns that it is connected to. This models a common scenario in on...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Queueing Systems

سال: 2018

ISSN: 0257-0130,1572-9443

DOI: 10.1007/s11134-018-9593-y